Implementing Multi-Perspective Context Matching for the SQuAD Task in TensorFlow
Abstract
The Multi-Perspective Context Matching model, introduced by Wang et al. [1] in 2016, produces strong results on the SQuAD question answering task. As of this writing, it is tied for 3rd place on the SQuAD leaderboard.¹ Implementing the model efficiently is difficult in practice, and the original paper leaves out some implementation details. The goal of this paper is to give practical details about my own implementation of the model and its performance, to accompany a code submission that is known to produce reasonably good scores.

¹ SQuAD homepage with leaderboard: https://rajpurkar.github.io/SQuAD-explorer/

1 The SQuAD task

This is an introduction to SQuAD. Please see the original paper for more information on the dataset, task, and evaluation.

1.1 Dataset

The SQuAD question answering task was introduced by Rajpurkar et al. [2] in 2016 to provide a challenging, large, high-quality dataset for evaluating Reading Comprehension (RC). Those authors found existing datasets to be either too small or unreflective of genuine human reading comprehension, and intended SQuAD to be both large and qualitatively representative of real human understanding. The dataset consists of 100,000+ questions with one or more ground truth answers per question. Each ground truth answer is a span of text taken verbatim from the reading passage that the question is associated with.

1.2 Task

The task is to take pairs of context reading passages and questions, and to return the span of text from the context that answers the question.

1.3 Evaluation

Success on the task is evaluated in two ways.

F1: This is the average overlap between predictions and correct answers. The F1 is calculated for each prediction/correct answer pair among the answers under each question, the maximum is taken per question, and the average is then computed over the per-question maximum values.
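The F1 procedure described in Section 1.3 can be sketched in a few lines of Python. This is a minimal illustration, not the official SQuAD evaluation script: it assumes plain lowercasing and whitespace tokenization, whereas the official script applies further answer normalization. The function names `token_f1` and `squad_f1` and the dictionary layout keyed by question ID are hypothetical choices made for this example.

```python
from collections import Counter


def token_f1(prediction: str, truth: str) -> float:
    """Token-overlap F1 between one prediction and one ground truth answer.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    pred_tokens = prediction.lower().split()
    truth_tokens = truth.lower().split()
    if not pred_tokens or not truth_tokens:
        # Two empty strings match perfectly; otherwise there is no overlap.
        return float(pred_tokens == truth_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    common = Counter(pred_tokens) & Counter(truth_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def squad_f1(predictions: dict, ground_truths: dict) -> float:
    """Average over questions of the best F1 against any ground truth.

    predictions:   question id -> predicted answer string (hypothetical layout)
    ground_truths: question id -> list of ground truth answer strings
    """
    per_question_max = [
        max(token_f1(predictions[qid], truth) for truth in truths)
        for qid, truths in ground_truths.items()
    ]
    return sum(per_question_max) / len(per_question_max)
```

Taking the maximum per question means a prediction is not penalized for matching only one of several acceptable ground truth answers.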
Similar resources
Assignment 4: Question Answering on the SQuAD Dataset with Part-of-speech Tagging
This research applies deep learning with bi-LSTMs to train a model that responds to queries on the Stanford Question Answering Dataset (SQuAD). The model design was motivated by Wang et al.'s December 2016 IBM Research paper on multi-perspective context matching for machine comprehension. Using TensorFlow, we implemented a multi-layer neural network architecture utilizing bi-directional LSTMs ...
Implementation and New Variants Exploration of the Multi-Perspective Context Matching Deep Neural Network Model for Machine Comprehension
This project explores the multi-perspective context matching method for the task of reading comprehension using the SQuAD dataset. The original six-layer model presents an interesting system for exploring deep learning architectures and their implementations in TensorFlow. The first step was to design an efficient implementation of this complex model in TensorFlow. The second step, and the aim ...
Question Answering with SQuAD: Variations on Multi-Perspective Context Matching
We implement multi-perspective context matching for the task of question answering on the SQuAD dataset and explore a variety of modifications to this core architecture. In our first modification, we compare the performance of GRUs with that of LSTMs in the original model. Next, we attempt to predict the answer's start index and length rather than its start and end indices. Finally, we introduce ...
Reading Comprehension with SQuAD
The SQuAD dataset provides a reading comprehension task with (question, answer, context) triples. This paper attempts to develop a model that predicts the answer for a given (question, context) pair. With a model inspired by "Multi-Perspective Context Matching for Machine Comprehension" by Wang et al., this paper achieves an F1 score of 0.47 on a validation set.
Multi-Perspective Context Matching for SQuAD Dataset
Question answering is an important task in machine comprehension. The new SQuAD dataset allows us to deploy recent NLP deep learning techniques and train an end-to-end system to predict the start and end position of the answer in the given context, instead of precisely selecting the words of the correct answer. We propose to combine bi-directional LSTM (BiLSTM) and context matching to devel...